feat(evaluators): new api from runners-api#242
scorer_label: str | None = Field(
default=None,
scorer_id: str = Field(
This backend schema now requires scorer_id, but the Luna UI still accepts label-only/version-only configs and sends scorer_id: null. That creates a save-time validation error. Can we update the UI validation/copy to require scorer_id, or keep the backend one-of identifier contract?
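To make the failure mode concrete, here is a minimal sketch of a required `scorer_id` rejecting a payload that carries `scorer_id: null`. It uses a plain dataclass as a stand-in for the real Pydantic model; the class name `LunaScorerConfig`, the helper `parse_luna_config`, and the blank-string check are assumptions for illustration, not the evaluator's actual code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class LunaScorerConfig:
    """Hypothetical stand-in for the galileo.luna config schema."""

    scorer_id: str                           # required runtime identity
    scorer_label: Optional[str] = None       # display/metadata only
    scorer_version_id: Optional[str] = None  # optional version pin

    def __post_init__(self) -> None:
        # Reject None and non-strings. Rejecting blank strings as well is an
        # extra assumption made for this sketch.
        if not isinstance(self.scorer_id, str) or not self.scorer_id.strip():
            raise ValueError("scorer_id is required")


def parse_luna_config(payload: dict[str, Any]) -> LunaScorerConfig:
    """Build a config from a saved payload, ignoring unrelated keys."""
    return LunaScorerConfig(
        scorer_id=payload.get("scorer_id"),
        scorer_label=payload.get("scorer_label"),
        scorer_version_id=payload.get("scorer_version_id"),
    )
```

With this shape, a label-only payload such as `{"scorer_label": "Toxicity", "scorer_id": None}` raises `ValueError` at parse time, which corresponds to the save-time validation error described above.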
@abhinav-galileo The enterprise UI fetches the scorers from /v1/scorers?scorer_type=preset&scorer_type=luna, and the response includes scorer_id and scorer_version_id, so the UI resolves the selected Luna scorer before saving. The visible field is label-oriented: we show the scorer label on screen, but selecting it populates the hidden scorer_id and scorer_version_id fields, so Agent Control receives the canonical scorer ID.
Given that, I think we should keep the backend contract as scorer_id required and not restore label-only/version-only support in Agent Control. Accepting label-only in AC would require AC to perform scorer lookup/resolution itself.
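For reference, the resolution step described above can be sketched as follows. This is only an illustration in Python of the logic, not the UI's code: the function name `resolve_scorer` and the `label` key in the scorer list are assumptions, and the list stands in for whatever the /v1/scorers response actually looks like.

```python
from typing import Any


def resolve_scorer(label: str, scorers: list[dict[str, Any]]) -> dict[str, Any]:
    """Map a selected label to the identifiers that get persisted."""
    # Assumed response shape: each entry has "label", "scorer_id", and
    # optionally "scorer_version_id".
    matches = [s for s in scorers if s.get("label") == label]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one scorer labelled {label!r}, found {len(matches)}"
        )
    scorer = matches[0]
    return {
        "scorer_id": scorer["scorer_id"],                     # canonical identity
        "scorer_label": label,                                # display only
        "scorer_version_id": scorer.get("scorer_version_id"), # optional pin
    }
```

Keeping this lookup on the client side is what lets Agent Control treat `scorer_id` as an opaque required value and avoid doing its own label resolution.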
configuration to the direct Luna scorer fields (`scorer_label`, `scorer_id`, or
`scorer_version_id`, plus `threshold` and `operator`). If you still need the
legacy Luna2 evaluator, pin `agent-control-evaluator-galileo <8`.
configuration to use the direct Luna scorer fields. `scorer_id` is required;
Nit: since scorer_id is now required, can we confirm the rollout path for any saved galileo.luna controls that only have scorer_label or scorer_version_id? A short migration/compat note here or in the PR description would make the break explicit.
Good call. I’ll make this explicit in the PR notes.
Rollout path: this is a breaking config contract for galileo.luna. Saved Luna controls must include scorer_id; scorer_label is now display/metadata only, and scorer_version_id is an optional version pin. The enterprise UI already resolves the selected scorer before save and persists scorer_id.
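A pre-cutover audit for this rollout path could look roughly like the sketch below. The saved-control shape (`name`, `evaluator`, `config` keys) and the function name `controls_missing_scorer_id` are assumptions made for illustration; the real storage format may differ.

```python
from typing import Any, Iterable


def controls_missing_scorer_id(controls: Iterable[dict[str, Any]]) -> list[str]:
    """Return names of galileo.luna controls that lack a usable scorer_id."""
    missing = []
    for control in controls:
        # Only Luna controls are affected by the new contract.
        if control.get("evaluator") != "galileo.luna":
            continue
        config = control.get("config") or {}
        # Treat an absent, null, or empty scorer_id as missing.
        if not config.get("scorer_id"):
            missing.append(control.get("name", "<unnamed>"))
    return missing
```

Any control this reports would need to be re-saved through the UI or updated by hand with the resolved `scorer_id` before it can be evaluated.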
Summary
- /api/v1/scorers/invoke.
- GALILEO_API_SECRET_KEY or GALILEO_API_SECRET.
- scorer_id the required runtime identity; scorer_label is optional metadata and scorer_version_id is optional pinning.
- config: {} and never forward Galileo-API-Key to runners-api.
Scope
User-facing/API changes:
- scorer_id; label-only and version-only configs are rejected.
- GALILEO_RUNNERS_API_URL, GALILEO_LUNA_SCORER_ID, and internal secret auth.
Internal changes:
- /scorers/invoke and /internal/scorers/invoke client routing with runners-api routing.
Out of scope:
Risk and Rollout
- galileo.luna now requires scorer_id. scorer_label is optional display/metadata and scorer_version_id is an optional version pin. Any saved Luna controls that only have scorer_label or scorer_version_id must be re-saved through the UI or manually updated to include the resolved scorer_id before they can be evaluated with this version. The enterprise UI already resolves the selected scorer and saves scorer_id.
- scorer_id, migrate those controls or temporarily roll back before retrying cutover.
Testing
- make check (targeted package validation was run instead)
Targeted validation:
- make -C evaluators/contrib/galileo test -> 86 passed
- make -C evaluators/contrib/galileo lint -> passed
- make -C evaluators/contrib/galileo typecheck -> passed
- git diff --check -> clean
Checklist
scorer_id